Binary Feature Selection with Conditional Mutual Information

Author

  • François Fleuret
Abstract

In the context of classification, we propose to use conditional mutual information to select a family of binary features which are individually discriminating and weakly dependent. We show that on an image classification task, despite its simplicity, a naive Bayesian classifier based on features selected with this Conditional Mutual Information Maximization (CMIM) criterion performs as well as a classifier built with AdaBoost. We also show that this classification method is more robust than boosting when trained on a noisy data set.

Key-words: classification, feature selection, Bayesian classifier, mutual information

French title: Sélection de descripteurs par maximisation de l'information mutuelle conditionnelle (Feature selection by conditional mutual information maximization)

Résumé (translated from French): In a classification context, we propose to use conditional mutual information to select a family of binary features that are individually informative while being weakly dependent on one another. We show on an image classification problem that, despite its simplicity, a naive Bayesian classifier using features selected in this way achieves error rates similar to those of a classifier built with AdaBoost. We also show that this technique is much more robust than boosting in a noisy setting.
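The CMIM criterion named in the abstract can be illustrated with a short sketch. It assumes the usual greedy reading of the criterion: each candidate feature is scored by the smallest conditional mutual information it shares with the class given any single feature already picked, and the highest-scoring candidate is added next. The function names (`entropy`, `mutual_info`, `cond_mutual_info`, `cmim_select`) and the toy data are hypothetical, and the estimates are naive plug-in counts, so this is not the author's implementation.

```python
import numpy as np

def entropy(*columns):
    """Empirical joint entropy (in bits) of one or more discrete columns."""
    joint = np.stack(columns, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mutual_info(y, x):
    """I(Y; X) = H(Y) + H(X) - H(Y, X)."""
    return entropy(y) + entropy(x) - entropy(y, x)

def cond_mutual_info(y, x, z):
    """I(Y; X | Z) = H(Y, Z) + H(X, Z) - H(Y, X, Z) - H(Z)."""
    return entropy(y, z) + entropy(x, z) - entropy(y, x, z) - entropy(z)

def cmim_select(X, y, k):
    """Greedily pick k column indices of X (assumed discrete, e.g. binary)."""
    n_features = X.shape[1]
    # score[n] starts as I(Y; X_n) and is then lowered to the minimum of
    # I(Y; X_n | X_s) over every feature s selected so far.
    score = np.array([mutual_info(y, X[:, n]) for n in range(n_features)])
    selected = []
    for _ in range(k):
        best = int(np.argmax(score))
        selected.append(best)
        score[best] = -np.inf  # never pick the same feature twice
        for n in range(n_features):
            if np.isfinite(score[n]):
                cmi = cond_mutual_info(y, X[:, n], X[:, best])
                score[n] = min(score[n], cmi)
    return selected

# Toy data (hypothetical): column 1 duplicates column 0, column 2 is complementary.
a = np.array([0, 0, 0, 0, 1, 1, 1, 1])
b = np.array([0, 0, 1, 1, 0, 0, 1, 1])
y = 2 * a + b
X = np.stack([a, a, b], axis=1)
print(cmim_select(X, y, 2))  # → [0, 2]
```

In the toy data all three columns carry one bit about the class on their own, but once column 0 is picked the duplicate column 1 adds nothing, so its score drops to zero and column 2 is chosen instead. This is the "individually discriminating and weakly dependent" behaviour the abstract describes.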

Related resources

Fast Binary Feature Selection with Conditional Mutual Information

We propose in this paper a very fast feature selection technique based on conditional mutual information. By picking features which maximize their mutual information with the class to predict, conditional on any feature already picked, it ensures the selection of features which are both individually informative and pairwise weakly dependent. We show that this feature selection method outperfor...

Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information

This study introduces a novel feature selection approach, CMICOT, which is a further evolution of filter methods with sequential forward selection (SFS) whose scoring functions are based on conditional mutual information (MI). We state and study a novel saddle point (max-min) optimization problem to build a scoring function that is able to identify joint interactions between several features. Th...

Learning LBP structure by maximizing the conditional mutual information

Local binary patterns of more bits extracted in a large structure have shown promising results in visual recognition applications. This results in very high-dimensional data, so that it is not feasible to directly extract features from the LBP histogram, especially for a large-scale database. Instead of extracting features from the LBP histogram, we propose a new approach to learn discriminative ...

Conditional Mutual Information-Based Feature Selection Analyzing for Synergy and Redundancy

© 2011 ETRI Journal, Volume 33, Number 2, April 2011.

Battiti's mutual information feature selector (MIFS) and its variant algorithms are used for many classification applications. Since they ignore feature synergy, MIFS and its variants may cause a big bias when features are combined to cooperate together. Besides, MIFS and its variants estimate feature redundancy regardless of the correspondin...

Conditional Mutual Information Based Feature Selection for Classification Task

We propose a sequential forward feature selection method to find a subset of features that are most relevant to the classification task. Our approach uses a novel estimate of the conditional mutual information between a candidate feature and the classes, given a subset of already selected features, which is used as a classifier-independent criterion for evaluating feature subsets. The proposed ...

Publication date: 2003